We describe a method for lossless quantum compression if the output of the information source is 
not known. We compute the best possible compression rate, minimizing the expected base length 
of the output quantum bit string (the base length of a quantum string is the maximal length in 
the superposition). This complements work by Schumacher and Westmoreland who calculated the 
corresponding rate for minimizing the output's average length. 

Our compressed code words are prefix-free indeterminate-length quantum bit strings which can 
be concatenated in the case of multiple sources. Therefore, we generalize the known theory of prefix- 
free quantum codes to the case where strings have indeterminate length. Moreover, we describe a 
communication model which allows the lossless transmission of the compressed code words. The 
benefit of compression is then the reduction of transmission errors in the presence of noise. 



PACS numbers: 03.67.-a, 03.67.Hk 
I. INTRODUCTION 

One of the main aims of information theory is to de- 
termine the most efficient way to compress messages. 
The solution to this problem often reveals relations to 
entropy-like quantities, as in Shannon's noiseless coding 
theorem [1], where the entropy of the information source 
determines the best possible compression rate. 

The situation in quantum information theory is quite 
similar. The most popular example is Schumacher's 
noiseless coding theorem Q , showing that the best possi- 
ble compression rate in the quantum case is given by von 
Neumann entropy. "Compression" here means that the 
number of qubits that have to be transmitted to faithfully 
exchange a quantum state is minimized. This definition 
shows that the compression of quantum information is 
automatically related to the problem of communication: 
once the compression is accomplished, then how can the 
compressed code words be transmitted to a receiver? 

This question addresses an important difficulty in the 
quantum situation which does not arise in classical in- 
formation theory: if a variable-length code is used for 
quantum compression, some code words will be shorter 
than others. But this may result in code words which are 
in a superposition of different lengths — how can those 
code words be transmitted without disturbance? 

*Electronic address: mueller@math.tu-berlin.de; also at Institute 
of Mathematics, Berlin Institute of Technology (TU Berlin). 
†Electronic address: caroline@dcs.warwick.ac.uk 
‡Electronic address: biju@dcs.warwick.ac.uk 

This problem is one of several reasons why it was previously 
stated [1, 13, IE B 0] that lossless compression of an ensemble 
$\mathcal{E} = \{p_i, |\psi_i\rangle\}$ of quantum states is in 
general impossible if the value $i$ of the state $|\psi_i\rangle$ 
to be compressed is unknown. A related objection pj is that 
prefix-free codes are also useless in the quantum situation: a 
prefix-free code word carries its own length information. If it is 
transmitted over a channel, that length information must be read 
out to see when the transmission is over and the channel can be 
closed. Again, if the code word is in a superposition of different 
lengths, this reading-out measurement disturbs the code word. 

In this paper, we show that the aforementioned prob- 
lems do not appear if one uses a channel instead which is 
always open. In this case, there is no need to decide when 
the transmission is finished. Even in the case of such a 
channel, compression can be beneficial: it can help to 
reduce transmission errors. 

To better understand the purpose of this paper, it 
makes sense to think about the compression of quantum 
information as taking place in several steps: 

Step 1. First, the quantum state is compressed, typically 
yielding an output code which is in a superposition 
of different lengths. 

Step 2. Optional: The output code is cut off (projected) 
to get a determinate-length code (which introduces 
some loss). 

Step 3. Finally, the code is transmitted over some quantum 
channel. 

Actually, Schumacher and Westmoreland [7] give a 
method of this form for compression of quantum information 
using a prefix-free quantum code. In fact, Step 1 in their 
setting is lossless: it is a unitary and thus reversible 
operation that minimizes the output's average length. 

Then they show that a projection to the first $n(S + \delta)$ 
qubits does not disturb the message very much, where $S$ 
is the source's entropy. This is Step 2 in the scheme 
above, which introduces some (small) loss. As the resulting 
code word consists of a classically known, determinate 
number of qubits, it is clear how to transmit it over 
a channel (Step 3). 

In this paper, we describe how Step 1 can be carried 
out losslessly if the task is to minimize the output's base 
length instead of the average length (both length notions 
are discussed in detail below, cf. Definition III.2), and 
we compute the best possible compression rate in terms 
of an entropy-like quantity (Theorem VI.4) for 
the case of a single quantum information source. We do 
this using prefix-free quantum bit strings, so that code 
words can be concatenated in the case of several sources. 

For this reason, we advance the theory of prefix-free 
quantum strings, by generalizing the definition and re- 
sults of Schumacher and Westmoreland in a natural way. 

Moreover, we explain that Step 3 is unproblematic if 
the channel in question is always open, even if Step 2 
is dropped. All in all, this gives a lossless method of 
compression and transmission of quantum information. 
The price we pay for it is that there is no way to see when 
the transmission is finished. Yet, the benefit is that the 
probability of transmission errors can be reduced. 



II. SYNOPSIS 

This paper is organized as follows: 

• In Section III we give a brief description of previous 
work on lossless quantum compression. In particu- 
lar, we review the arguments why and in what way 
lossless quantum compression of unknown states 
seems to be impossible. We define indeterminate- 
length quantum bit strings (as used by several au- 
thors before) and give a physical interpretation. 

• In Section IV we give a communication model 
which describes a situation where lossless quantum 
compression is possible and useful. In short, we ex- 
plain the model of an "always-open channel" where 
neither Alice nor Bob know when the transmission 
has finished, but both parties benefit from compres- 
sion by reducing transmission errors. 

• Section V contains a review of some results on 
prefix-free quantum bit strings, generalizing work 
by Schumacher and Westmoreland [7]. We also 
give new results which have useful interpretations 
in the framework of our compression scheme. More- 
over, we prove that the concatenation of prefix-free 
indeterminate-length quantum bit strings can in 
principle be implemented physically. 



• Our main result is Theorem VI.4. It 
states the optimal rate for prefix-free compression 
of the unknown output of a single quantum infor- 
mation source, given the task to minimize the ex- 
pected base length. 

To state the theorem, we define "monotone en- 
tropy" and "sequential projections" and discuss 
some properties that simplify the computation of 
their actual numerical values. 

III. PREVIOUS WORK 

The aim of lossless quantum compression is to compress 
the unknown output of an ensemble 
$\mathcal{E} = \{p_i, |\psi_i\rangle\}$ of quantum states using a 
variable-length quantum code so that the original state can always 
be retrieved exactly and without error. When the $|\psi_i\rangle$ 
are orthogonal, this is equivalent to lossless classical 
compression. The challenge is therefore to encode $\mathcal{E}$ 
when the $|\psi_i\rangle$ are non-orthogonal and the code words 
might have indeterminate lengths. 

In this section, we first give a definition of quantum bit 
strings that consist of a superposition of classical strings 
of different lengths. Then we describe previous work on 
how to use such quantum strings for compression, and the 
difficulties that arise in such models. Finally, we outline 
a physical interpretation of these indeterminate-length 
quantum strings. 

A. Indeterminate-Length Quantum Bit Strings 

The strategy of classical variable-length compression 
is to assign short code words C(x) to frequent events 
x (e.g. to frequent symbols in a text in some natural 
language), while rare events are assigned the remaining 
long code words. Trying a similar approach in quantum 
information theory naturally produces code words that 
are superpositions of classical strings of different lengths. 

For example, suppose we have two letters A and B, 
and a classical code C of the form C(A) = 0 and C(B) = 11. 
If, as a first naive try, we extend this map unitarily 
to quantum states spanned by $|A\rangle$ and $|B\rangle$, we get 
for example 

$$\frac{1}{\sqrt{2}}\big(|A\rangle + |B\rangle\big) \mapsto \frac{1}{\sqrt{2}}\big(|0\rangle + |11\rangle\big),$$

which does not have a determinate length, since it is 
in a superposition of lengths 1 and 2. It is called an 
indeterminate-length quantum bit string. Such strings 
can formally be defined as follows: 

Definition III.1 (Quantum Bit String) A quantum 
state $|\psi\rangle$ is a quantum bit string (or qubit string) if 
it is an element of the Fock space (or string space) 

$$\mathcal{H}_{\{0,1\}^*} := \bigoplus_{n=0}^{\infty} (\mathbb{C}^2)^{\otimes n},$$

that is, if it can be expressed as a superposition of 
classical bit strings of the form 

$$|\psi\rangle = \sum_{s \in \{0,1\}^*} \alpha_s |s\rangle$$

with $\alpha_s \in \mathbb{C}$ and $\sum_{s \in \{0,1\}^*} |\alpha_s|^2 = 1$. 

For convenience, we will sometimes drop the normalization 
condition. Moreover, it sometimes makes sense 
to call normal mixed states, i.e. density operators, on 
$\mathcal{H}_{\{0,1\}^*}$ qubit strings, too. The reason is that 
the prefixes of pure qubit strings can be mixed, which will be 
explained in detail below in Section V. 

Bostrom and Felbinger Q defined two ways to quantify 
the lengths of indeterminate-length strings. 

Definition III.2 (Length of Qubit Strings [ij) 

The base length $L$ of an indeterminate-length string 
$|\psi\rangle = \sum_{s \in \{0,1\}^*} \alpha_s |s\rangle$ is the 
length of the longest part of its superposition, 

$$L(\psi) = L\Big(\sum_{s \in \{0,1\}^*} \alpha_s |s\rangle\Big) := \max_{\alpha_s \neq 0} \ell(s),$$

or $\infty$ if the maximum does not exist. This can also be 
written as $L(\psi) = \max\{\ell(s) \mid \langle s|\psi\rangle \neq 0\}$. 
The average length $\bar{\ell}$ of an indeterminate-length quantum 
bit string is the expectation value of the length, 

$$\bar{\ell}(\psi) = \bar{\ell}\Big(\sum_{s \in \{0,1\}^*} \alpha_s |s\rangle\Big) = \sum_{s \in \{0,1\}^*} |\alpha_s|^2\, \ell(s),$$

which may as well be infinite. It can be written as 
$\bar{\ell}(\psi) = \langle\psi|\Lambda|\psi\rangle$, where $\Lambda$ 
is the length operator, defined by linear extension of 

$$\Lambda|s\rangle = \ell(s)\,|s\rangle \qquad (s \in \{0,1\}^*).$$

Formally, $\Lambda$ is an unbounded self-adjoint operator, 
defined on a dense subspace of $\mathcal{H}_{\{0,1\}^*}$. 

If the length of a quantum string is observed, then $\bar{\ell}$ 
gives the expected length that is observed and $L$ gives the 
maximum length that can be observed. However, given an 
unknown indeterminate-length string $|\psi\rangle$, neither its 
average length nor its base length can be measured without 
disturbing it. 
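
As a minimal sketch of the two length notions from Definition III.2, the following Python fragment represents a qubit string as a dictionary from classical bit strings to amplitudes. This representation and the function names are assumptions made for illustration only; they do not appear in the original text.

```python
from math import isclose, sqrt

def base_length(psi):
    """Base length L: the maximal l(s) over strings s with non-zero amplitude."""
    return max(len(s) for s, amp in psi.items() if abs(amp) > 0)

def average_length(psi):
    """Average length <psi|Lambda|psi>, assuming psi is normalized."""
    return sum(abs(amp) ** 2 * len(s) for s, amp in psi.items())

# The example state from Section III A: (|0> + |11>)/sqrt(2).
psi = {"0": 1 / sqrt(2), "11": 1 / sqrt(2)}
print(base_length(psi))                       # longest branch has 2 bits
print(isclose(average_length(psi), 1.5))      # (1 + 2)/2 up to rounding
```

The sketch only evaluates the two quantities for a known state; as stated above, neither can be measured on an unknown string without disturbing it.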

B. Can Indeterminate-Length Quantum Strings be 
Used for Coding? 

Various papers [1, 0, S IE 0] have described problems 
in using indeterminate-length strings for lossless quan- 
tum data compression. Braunstein et al. Q pointed out 
three difficulties of data compression with indeterminate- 
length strings. The first is that if the indeterminate- 
length strings are unknown to both the sender and the 



receiver, then it becomes impossible to synchronise the 
different computational paths (taking different numbers 
of time steps) that are performed on the strings. 

The second difficulty is that if a mixture of 
indeterminate-length strings is transmitted at a fixed 
speed, then the recipient can never be sure when a mes- 
sage has arrived and the strings can be decompressed. 
The third difficulty is that if the data compression is 
performed by a read/write head (like a Turing machine), 
then after the data compression, the head location 
of the sender is entangled with the "lengths" of the 
indeterminate-length string which represents the 
compressed data. 

Koashi and Imoto [|| argued that it is impossible 
to faithfully encode a mixture of non-orthogonal quantum 
states if the particular output states of the quantum 
information source are not known. They modelled lossless 
data compression as taking place in a register of $N$ 
qubits. A compressed state in the register would be an 
unknown indeterminate-length quantum string with base 
length $L$, in which case only the remaining $N - L$ qubits 
would be usable by other applications without disturbing 
the compressed state. However, the base length $L$ is not 
an observable, so the other applications cannot determine 
how many qubits are available. Thus the remaining 
$N - L$ qubits are not available for other applications to 
use, unless there is some a priori knowledge about $L$. 

Schumacher and Westmoreland [7] showed that lossless 
quantum compression cannot be carried out by 
a unitary operation in a simple model of communication. 
They envisaged that indeterminate-length quantum 
strings would be padded with zeros to create 
determinate-length strings (we explain this in more detail 
below in Subsection III C). They modelled the data compression 
as taking place between two parties, Alice and Bob, 
in which Alice sends Bob only the original strings (with 
the zero-padding removed), leaving Alice with a number 
of zeros depending on the length of the string she sent. 
If she sends Bob an indeterminate-length string, then after 
the transmission, Alice and Bob are entangled by the 
number of zeros that are left on Alice's register. 

Bostrom and Felbinger Q argued that it is not useful 
to consider quantum generalizations of classical prefix- 
free codes: classical prefix-free strings carry their own 
length information, but the length information in an 
indeterminate-length quantum string cannot be observed 
without disturbing the string. Their solution to this 
problem is to use a classical side channel to inform the 
receiver where to separate the code words. 

Ahlswede and Cai 0] followed the same idea by sending 
the length information over a classical side channel. 
Compared to the scheme of Bostrom and Felbinger, they 
improved the compression rate by giving a more efficient 
way to use the side channel, and they characterized the 
optimal compression rate in this setting. We describe both 
approaches in more detail below in Subsection III D. 

However, in both cases, the use of the classical side 
channel requires that the sender (Alice) knows the output 
of the quantum information source (at least partially), 
and thus the length of the compressed code word. This 
is in contrast to the situation examined in this paper. 

C. Schumacher and Westmoreland's Prefix-Free 
Average Length Compression 

Schumacher and Westmoreland [7] investigated the 
general properties of indeterminate-length strings. An 
indeterminate-length string can be padded with zeros 
such that it consists of a determinate number of qubits. 

Definition III.3 (Zero-extended form) 

If $|\psi\rangle = \sum_{\ell(s) \le l_{\max}} \alpha_s |s\rangle$ is a 
quantum string in a register of $l_{\max}$ qubits, then its 
zero-extended form is 

$$|\psi_{\mathrm{zef}}\rangle = \sum_{\ell(s) \le l_{\max}} \alpha_s\, |s\, 0^{\,l_{\max} - \ell(s)}\rangle.$$

This string has a determinate length of $l_{\max}$ qubits. 
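
The zero-extended form can be sketched as follows, again under the assumed (hypothetical) representation of a qubit string as a dictionary from classical bit strings to amplitudes. Amplitudes of branches that coincide after padding are added, which is what linear extension of the padding map gives.

```python
from math import sqrt

def zero_extend(psi, l_max):
    """Pad every classical branch of psi with zeros up to length l_max."""
    padded = {}
    for s, amp in psi.items():
        if len(s) > l_max:
            raise ValueError("branch longer than the register")
        key = s + "0" * (l_max - len(s))
        # Branches that coincide after padding have their amplitudes added.
        padded[key] = padded.get(key, 0) + amp
    return padded

a = 1 / sqrt(2)
print(zero_extend({"0": a, "11": a}, 3))   # branches become 000 and 110
```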

Given a sequence of N strings, it is useful to be able 
to "concatenate" them so that the strings are packed to- 
gether at the beginning of the string and the zero-padding 
all lies at the end of the sequence. Schumacher and West- 
moreland call this the "condensation operation" . 

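Definition III.4 (Condensable strings) 

A code is condensable if for every $N$, there exists a 
unitary operation $U$ such that 

$$U\big(|\psi^1_{\mathrm{zef}}\rangle \otimes \cdots \otimes |\psi^N_{\mathrm{zef}}\rangle\big) = \big(|\psi^1\rangle \otimes \cdots \otimes |\psi^N\rangle\big)_{\mathrm{zef}}$$

for all the code words which are "length eigenvectors", 
i.e. if each $|\psi^i\rangle$ contains in its superposition only 
classical words of some fixed length. 

For example, if $|\psi^1\rangle = |0\rangle$, $|\psi^2\rangle = |10\rangle$ 
and $|\psi^3\rangle = |111\rangle$, then the condensation operation 
$U$ acts as 

$$U\big(|\psi^1_{\mathrm{zef}}\rangle \otimes |\psi^2_{\mathrm{zef}}\rangle \otimes |\psi^3_{\mathrm{zef}}\rangle\big) = U\big(|000\rangle \otimes |100\rangle \otimes |111\rangle\big)$$
$$= |0\rangle \otimes |10\rangle \otimes |111\rangle \otimes |000\rangle$$
$$= \big(|0\rangle \otimes |10\rangle \otimes |111\rangle\big)_{\mathrm{zef}}.$$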
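
The action of the condensation operation on classical basis strings can be sketched as below. This is only an illustration for a classical prefix code (the code set `CODE` and the function name are assumptions); the unitary $U$ would be the linear extension of such a reversible classical map.

```python
def condense(padded_words, code):
    """Pack zero-padded code words of a classical prefix code to the front."""
    l_max = len(padded_words[0])
    words = []
    for w in padded_words:
        # By prefix-freedom, exactly one code word c satisfies w = c + zeros.
        matches = [c for c in code
                   if w.startswith(c) and set(w[len(c):]) <= {"0"}]
        if len(matches) != 1:
            raise ValueError(f"{w!r} is not a zero-padded code word")
        words.append(matches[0])
    packed = "".join(words)
    # All of the zero-padding ends up at the end of the sequence.
    return packed.ljust(l_max * len(padded_words), "0")

CODE = {"0", "10", "111"}  # hypothetical prefix code matching the example
print(condense(["000", "100", "111"], CODE))
```

For the example in the text, the three padded words are mapped to the nine-bit string consisting of 0, 10, 111 followed by three padding zeros.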

Superpositions of classical prefix-free strings are con- 
densable. More generally, Schumacher and Westmore- 
land gave a definition of prefix-free quantum strings and 
showed that they are condensable. According to their 
definition, two strings are prefix-free if when the addi- 
tional qubits in the longer string are traced out, the re- 
sulting prefixes are orthogonal. 

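Definition III.5 (Zero-padded prefix-freedom [7]) 

Suppose $|\psi\rangle$ and $|\varphi\rangle$ are quantum strings with 
$n := L(|\psi\rangle) \le L(|\varphi\rangle)$ and that they are in a 
register of $l_{\max}$ qubits. The first $n$ qubits of 
$|\varphi_{\mathrm{zef}}\rangle$ may be in a mixed state, described by 
the density operator 

$$\rho^{1 \ldots n} = \mathrm{Tr}_{n+1,\ldots,l_{\max}}\big(|\varphi_{\mathrm{zef}}\rangle\langle\varphi_{\mathrm{zef}}|\big).$$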



The strings $|\psi\rangle$ and $|\varphi\rangle$ are prefix-free if 
the state of the first $n$ qubits of the longer string is 
orthogonal to $|\psi\rangle$, i.e. if 
$\langle\psi|\rho^{1 \ldots n}|\psi\rangle = 0$. 

This definition assumes that the two strings have determinate 
length. Two of the authors defined prefix-free 
strings more generally such that they can be supported 
on subspaces which are spanned by indeterminate-length 
quantum bit strings [8]. We give a review of this more 
general definition in Section V, which in fact contains 
Definition III.5 as a theorem (Lemma V.5). 

Given many copies $\mathcal{E}^{\otimes n}$ of a quantum information 
source $\mathcal{E}$, Schumacher and Westmoreland further showed 
how to use prefix-free quantum bit strings for lossless 
compression (this corresponds to "Step 1" of the compression 
process as described in the Introduction) using 
appropriate unitary operations. The indeterminate-length 
output is then projected onto the first 
$n(S(\mathcal{E}) + \delta)$ qubits ("Step 2"), where $S$ is von 
Neumann entropy. This projection (or partial trace) introduces 
only a small error which vanishes in the asymptotic case 
$n \to \infty$. 

This can be seen as follows: Let $\rho$ be the density 
operator corresponding to $\mathcal{E}$, with spectral decomposition 

$$\rho = \sum_i p_i\, |i\rangle\langle i|.$$

Then $\mathcal{E}$ can be compressed by encoding each $|i\rangle$ as a 
prefix-free string of length $\lceil -\log(p_i) \rceil$ with 
zero-padding. $\rho^{\otimes n}$ can be compressed in the same 
fashion, by encoding every factor individually and applying the 
condensation operation to the resulting code words. Every string 
$|\psi\rangle$ in the typical subspace of $\rho^{\otimes n}$ has 
probability $\langle\psi|\rho^{\otimes n}|\psi\rangle$ arbitrarily 
close to $2^{-nS(\rho)}$ as $n$ grows large. Thus, vector 
states in the typical subspace of $\rho^{\otimes n}$ are encoded in a 
classical manner as strings of length arbitrarily close to 
$nS(\rho)$. The image of the projection on the first 
$n(S(\rho) + \delta)$ qubits thus contains this typical subspace. 
Since, with overwhelming probability, the output is very close to 
this subspace, it can afterwards be decoded with high (but not 
perfect) fidelity. 
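
The code word lengths $\lceil -\log(p_i) \rceil$ used above always satisfy the classical Kraft inequality, so a prefix code with these lengths exists. A minimal numerical sketch, assuming base-2 logarithms and two hypothetical spectra chosen for illustration:

```python
from math import ceil, log2

def code_lengths(probs):
    """Lengths ceil(-log2 p_i) assigned to the eigenvectors |i> of rho."""
    return [ceil(-log2(p)) for p in probs]

def kraft_sum(lengths):
    """Left-hand side of the Kraft inequality, sum_i 2^(-l_i)."""
    return sum(2.0 ** -l for l in lengths)

def entropy(probs):
    """Shannon entropy of the spectrum, equal to S(rho) in bits."""
    return -sum(p * log2(p) for p in probs)

dyadic = [0.5, 0.25, 0.125, 0.125]   # assumed example spectrum
generic = [0.6, 0.3, 0.1]            # assumed example spectrum
for probs in (dyadic, generic):
    lengths = code_lengths(probs)
    mean = sum(p * l for p, l in zip(probs, lengths))
    print(lengths, kraft_sum(lengths), entropy(probs), mean)
```

For the dyadic spectrum the mean code length equals the entropy; in general it lies between $S(\rho)$ and $S(\rho) + 1$.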

Hence this compression scheme consists of two parts as 
already mentioned in the Introduction: in a first step, the 
quantum message is compressed losslessly, in the sense 
that the output has minimal average length of about 
$n \cdot S(\mathcal{E})$. In a second part, some "cut-off" takes 
place, introducing some small error, but preparing the output 
to be transmitted over conventional channels by transforming 
it to fixed length [7]. 

One of the results of this paper is to show how the first 
step can be accomplished to minimize the expected base 
length (in the case of a single source), thus complement- 
ing the work by Schumacher and Westmoreland. 

D. Compression with Classical Side Channels 

Bostrom and Felbinger 0] gave a scheme for lossless 
quantum compression of known ensemble outputs using 
classical side channels. If $\mathcal{E} = \{p_i, |\psi_i\rangle\}$ is 
the mixture to be compressed, then they assume that the value of 
$i$ is known to the compressor, Alice. If she encodes 
$\mathcal{E}$ using a unitary operation $C$, then she sends the base 
length of the compressed string to Bob, the decompressor, through 
a classical side channel. She then sends $L(C|\psi_i\rangle)$ qubits 
of the zero-extended form of $C|\psi_i\rangle$ to Bob. Since the 
length of the encoded string is communicated classically, it is not 
necessary to use a prefix-free code to encode the quantum part; 
thus $C$ is unitary but not necessarily a condensation 
operation. 

Ahlswede and Cai studied quantum data compres- 
sion with classical side channels in more detail. They 
found an expression for the number of qubits that are 
sent through the quantum channel in Bostrom and Fel- 
binger's lossless quantum compression scheme jij. 

Moreover, they showed by using counterexamples that 
the optimal rate of compression R of a one-one code can- 
not be achieved by a greedy algorithm. However, the 
main goal of Ahlswede and Cai was to find a more effi- 
cient way to use the classical side channel than just to 
report the base lengths. They showed that the quantum 
part could be compressed further than in the scheme set 
out by Bostrom and Felbinger. 

The basis for their scheme is as follows. If $\mathcal{E}$ is the 
mixture to be compressed, and if there exists some small 
subspace $X$ such that several states $|\psi_i\rangle$ lie exactly 
within $X$, then this fact can be reported through the classical 
side channel. Thus the amount of quantum information 
that must be sent through the quantum channel is reduced. 
They gave an expression for the optimal rate of 
compression in their scheme of an ensemble 
$\mathcal{E} = \{p_i, |\psi_i\rangle\}$ when the states 
$|\psi_i\rangle$ are linearly independent (but not 
necessarily orthogonal). 

Compression with classical side channels has been 
studied in more detail for lossy compression [icj |. 
Hayashi and Matsumoto [11] investigated variable-length (but 
not indeterminate-length) universal compression. 

Rallan and Vedral [12] gave another scheme for lossless 
quantum compression with classical side channels which 
does not use zero-extended forms. They envisaged that 
the compressed state would be represented by photons, 
thus using a ternary alphabet 
$\{|0\rangle, |1\rangle, |\#\rangle\}$, where $|\#\rangle$ denotes the 
absence of a photon and marks the end of the string. They 
assumed that Alice has $n$ copies of an ensemble $\mathcal{E}$ 
which she would like to send to Bob. In this scheme, Alice only 
sends Bob the value of $n$ through the classical channel. This 
scheme has a nice physical interpretation. 



E. Physical Interpretation of Indeterminate-Length 
Strings 

Bostrom and Felbinger [i[ pointed out that variable-length 
quantum strings can be realised in a quantum system 
whose particle number is not conserved. Rallan and 
Vedral [12] described in detail an example system where 
the average length of a string can be interpreted as its 
energy. 

A Hilbert space $\mathcal{H}^{\otimes n}$ can be realised by a 
sequence of photons $|\phi_1\rangle \otimes \cdots \otimes |\phi_n\rangle$ 
in which $|\phi_i\rangle$ represents exactly one photon with 
frequency $\omega_i$. The value of the qubit $|\phi_i\rangle$ is 
realised by the polarisation of its photon, either horizontal 
$|0\rangle$ or vertical $|1\rangle$. The absence of a photon at 
a particular frequency can be represented by $|\#\rangle$, which 
is orthogonal to $|0\rangle$ and $|1\rangle$. Indeterminate-length 
strings are obtained by allowing the number of photons to exist 
in superposition and ordering the photons by their frequencies. 
The first $|\#\rangle$ (which can be in a superposition 
of positions) is used to mark the end of the string. 

The frequencies of the photons $|\phi_i\rangle$ are chosen to be 
approximately equal, so that $\omega_i \approx \omega$ for some 
value $\omega$. The energy in a superposition of photons is the 
average energy required to either create or destroy that 
superposition ($\hbar\omega$ per photon of frequency $\omega$, where 
$\hbar$ is the reduced Planck constant). Thus the energy of an 
indeterminate-length string of photons $|\phi\rangle$ is 
proportional to its average length and is given by (approximately) 
$\hbar\omega\,\bar{\ell}(|\phi\rangle)$. 

On the other hand, the base length of $|\phi\rangle$ represents 
the number of photons at different frequencies that are 
used to describe $|\phi\rangle$. Thus the base length of 
$|\phi\rangle$ is the size of the system required to carry the 
state $|\phi\rangle$. 



IV. COMMUNICATION MODEL FOR 
LOSSLESS QUANTUM COMPRESSION 

Now we describe a model of a communication chan- 
nel where lossless quantum compression of unknown mix- 
tures is possible and useful. 

The main argument why lossless quantum compression 
of unknown states seems to be impossible is that it is im- 
possible to determine how many qubits to transmit when 
the message is in a superposition of different lengths. If 
Alice has an unknown indeterminate-length qubit string, 
how can she find out when the transmission is finished 
and the channel can be closed? To avoid this problem, 
we look at always-open channels. 

[Figure 1 not reproduced. Labels in the figure: Alice, transmission cell, conditional swap, Bob.] 

FIG. 1: Schematic of an always-open channel as described in 
Section IV. 



A model of an always-open channel [ljj, [14] is shown 
in Fig. 1. Suppose Alice wants to send Bob a single code 
word of a quantum prefix code, i.e. an indeterminate-length 
qubit string $|\psi\rangle$ which is an element of a prefix-free 
subspace $\mathcal{H}$ of $\mathcal{H}_{\{0,1\}^*}$ that Alice and Bob 
have agreed upon in advance. (This single code word might itself be 
a concatenation of several prefix-free code words.) As we 
shall see later in Theorem VI.4, we may assume that $\mathcal{H}$ is 
spanned by the classical code words of a classical prefix 
code as in Subsection III C above. 

The main part of the channel is a transmission cell 
which carries exactly one qubit. Initially, this qubit is 
set to zero, and so are all the qubits in Bob's memory. 
Moreover, Alice's memory contains a zero-padded form 
of her message string, as described in Definition III.3. 

Now we describe the communication protocol. For 
each step, we describe what Alice and Bob are doing in 
the case of classical bits (i.e. in the case that the message 
qubit string $|\psi\rangle$ is just a classical string $|s\rangle$ out of 
the classical prefix-free orthonormal basis of $\mathcal{H}$), and we 
assume that the resulting operation is linearly extended 
to a unitary operation on the corresponding quantum 
system. The unitarity of the operations at Alice's and 
Bob's side is then assured by the reversibility of the 
corresponding classical operations. 

At step $i$ of the transmission, Alice swaps the $i$-th qubit 
of her padded message string with the content of the 
transmission cell. Afterwards, Bob checks if the $i - 1$ 
qubits he has received previously form a valid code word 
or not. (Due to prefix-freedom, if the answer is "yes", 
then the transmission must be over; cf. also 
Definition III.5 and Lemma V.5.) If the answer is "no", he 
swaps the $i$-th qubit of his memory (which is just a zero) 
with the content of the transmission cell; otherwise, he 
does not do anything. That is, Bob applies a conditional 
swap, where the condition is that the transmission is not 
yet finished. 

This way, the message qubit string is transmitted qubit 
by qubit. In the end, Alice ends up with a memory full 
of zeroes, while the transmission cell contains a zero as 
well, and Bob's memory carries the zero-padded message 
string. Hence the entanglement problem described by 
Schumacher and Westmoreland [7] is avoided, and the 
message qubit string is transmitted reversibly and 
unitarily from Alice to Bob. 
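
The protocol can be sketched for a single classical code word as follows. The prefix code `CODE`, the register size and the function name are hypothetical choices for illustration; the quantum protocol is the linear extension of this reversible classical procedure. Bob's test is implemented here as "some prefix of the bits received so far is a code word", which is how the sketch expresses the condition that the transmission is already finished.

```python
def transmit(codeword, code, l_max):
    """Send one classical code word through the always-open channel.

    Returns (Alice's memory, transmission cell, Bob's memory,
    number of times Bob accessed the cell).
    """
    alice = list(codeword.ljust(l_max, "0"))  # zero-padded message
    bob = ["0"] * l_max                       # Bob's memory starts at zero
    cell = "0"                                # one-qubit transmission cell
    accesses = 0
    for i in range(l_max):
        # Alice swaps her i-th bit with the transmission cell.
        alice[i], cell = cell, alice[i]
        # Bob checks whether the bits received so far already contain
        # a complete code word; by prefix-freedom this means "finished".
        received = "".join(bob[:i])
        finished = any(received[:k] in code for k in range(i + 1))
        if not finished:
            # Conditional swap, applied only while the word is incomplete.
            bob[i], cell = cell, bob[i]
            accesses += 1
    return "".join(alice), cell, "".join(bob), accesses

CODE = {"0", "10", "111"}  # hypothetical classical prefix code
for word in sorted(CODE):
    print(word, transmit(word, CODE, 3))
```

In each run Alice's memory and the cell end in the all-zero state, Bob holds the zero-padded code word, and the number of cell accesses equals the length of the code word.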

But what is the advantage of compression for such 
a communication channel, if that channel can never be 
switched off by Alice or Bob? It cannot be used to save 
transmission time (considered as a resource), because 
both parties never know if the transmission is already 
finished or not (unless some predefined maximal trans- 
mission time t max has passed). However, quantum com- 
pression can have other advantages: for example, sup- 
pose the transmission cell is subject to noise during the 
transmission. That is, the transmission of every single 
qubit has an inherent error probability. In this case, Al- 
ice can minimize transmission errors by compressing her 
quantum messages before sending them. 

To be more exact, as soon as the code word has been 
fully transmitted (i.e. at a time step corresponding to the 
message's base length), Bob stops accessing the transmission 
cell. Thus, any noise that affects the cell from that 
point on will not disturb the communication any more, 
because the channel to Bob is effectively closed. Hence, 
minimizing the number of qubits to be transmitted reduces 
the probability of transmission errors, even though 
neither Alice nor Bob knows the number of transmitted 
qubits. 
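
To illustrate the benefit numerically, consider a toy noise model that is an assumption of this sketch and not taken from the text: each access of the transmission cell fails independently with probability `eps`, and the message is a classical mixture of code words. The probability of an error-free transmission is then the average of $(1 - \epsilon)^{\text{length}}$ over the code words.

```python
from math import isclose

def success_probability(lengths, probs, eps):
    """Average probability that no transmitted (qu)bit is hit by noise."""
    return sum(p * (1 - eps) ** l for l, p in zip(lengths, probs))

probs = [0.5, 0.25, 0.125, 0.125]   # assumed source distribution
fixed = [2, 2, 2, 2]                # uncompressed: two bits per message
compressed = [1, 2, 3, 3]           # prefix code lengths ceil(-log2 p)
eps = 0.1                           # assumed per-qubit error probability

print(success_probability(fixed, probs, eps))
print(success_probability(compressed, probs, eps))
```

Under these assumptions the compressed code is transmitted without error more often than the fixed-length encoding, although some individual code words are longer.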

It is clear that the optimal compression method de- 
pends on the kind of noise that the system is exposed 
to. Obviously, in the case that each transmitted qubit 
is independently subject to the same kind of pertur- 
bation, then Schumacher and Westmoreland's average 
length compression method optimally minimizes trans- 
mission errors. But there are other conceivable scenarios: 
for example, we might have several channels at once that 
are subject to the same kind of noise, or time-dependent 
noise that grows with the number of qubits. In this case, 
it is not so clear any more what the best method of com- 
pression is. 

In this paper, we compute the best possible compres- 
sion method and the rate for minimizing the code's ex- 
pected base length. Although we do not currently know 
of a natural noise model where the expected base length 
determines the error probability, it seems likely that there 
are indeed natural situations where this kind of compres- 
sion is superior to average length compression — for ex- 
ample, models like those mentioned in the last paragraph 
where "later" qubits are subject to larger errors than 
"earlier" ones. 



V. PREFIX-FREE QUANTUM BIT STRINGS 

Schumacher and Westmoreland [7] defined prefix-free 
quantum strings in terms of their zero-extended forms 
using the partial trace, see Definition III.5 above. In 
Ref. [8], two of the authors have given another way to 
define prefix-free quantum strings which is more general 
and a more direct generalization of the classical definition. 
It can be shown to contain the definition by Schumacher 
and Westmoreland as a special case. In this section, 
we briefly review the definition and basic results on 
prefix-free quantum strings. 

The notion of the prefix of a classical string is closely 
related to the concatenation operation $\circ$. Thus, before we 
define prefix-free quantum strings, we first explain how 
to concatenate quantum bit strings. If 
$|\psi\rangle \in \mathcal{H}_{\{0,1\}^*}$ is any quantum bit string, 
and $s \in \{0,1\}^*$ is a classical bit string, then we can 
define $|\psi\rangle \circ s$ by linear extension of the classical 
concatenation: Expand 
$|\psi\rangle = \sum_{x \in \{0,1\}^*} \alpha_x |x\rangle$, and define 

$$|\psi \circ s\rangle := |\psi\rangle \circ s := \sum_{x \in \{0,1\}^*} \alpha_x\, |x \circ s\rangle.$$

Moreover, if $|\varphi\rangle = \sum_{t \in \{0,1\}^*} \beta_t |t\rangle$ is 
another qubit string with finite base length, then we set 

$$|\psi \circ \varphi\rangle := |\psi\rangle \circ |\varphi\rangle := \sum_{t \in \{0,1\}^*} \beta_t\, |\psi \circ t\rangle.$$

This concatenation operation on the quantum strings is 
related to the tensor product: If $|\psi\rangle$ is a length 
eigenstate (i.e. an eigenvector of the length operator $\Lambda$), 
then $|\psi\rangle \circ |\varphi\rangle = |\psi\rangle \otimes |\varphi\rangle$. 
However, if $|\psi\rangle$ is not a length eigenstate, then the 
concatenation operation is not always an 
isometry and thus not always physically meaningful [|[. 
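
A short numerical sketch can make the isometry remark concrete. Qubit strings are again represented as dictionaries from classical bit strings to amplitudes (an assumption of this illustration), and concatenation is the bilinear extension of classical string concatenation.

```python
from math import isclose, sqrt

def concat(psi, phi):
    """Bilinear extension of classical concatenation to qubit strings."""
    out = {}
    for x, a in psi.items():
        for t, b in phi.items():
            out[x + t] = out.get(x + t, 0) + a * b
    return out

def norm_sq(psi):
    """Squared norm of a qubit string."""
    return sum(abs(a) ** 2 for a in psi.values())

r = 1 / sqrt(2)
phi = {"0": r, "00": r}            # indeterminate length, normalized
eigen = {"0": r, "1": r}           # length eigenstate (length 1)
mixed = {"0": r, "00": r}          # not a length eigenstate

print(norm_sq(concat(eigen, phi)))   # norm is preserved
print(norm_sq(concat(mixed, phi)))   # norm is not preserved
```

In the second case the branches 0∘00 and 00∘0 coincide, so their amplitudes add up and the squared norm becomes 3/2 instead of 1.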

We can now define prefix-free sets of quantum strings 
(e.g. prefix-free subspaces of the string space 
$\mathcal{H}_{\{0,1\}^*}$) by direct generalization of the classical 
definition. Although there are several a priori possible 
generalizations, they all turn out to be equivalent (for a proof 
see Ref. [8]). To state them, we use the symbol $\lambda$ for the 
empty string of length zero. 

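Definition V.1 (Prefix-Free Sets of Qubit Strings) 

A set $M \subset \mathcal{H}_{\{0,1\}^*}$ of qubit strings is called 
prefix-free if one of the four following equivalent conditions holds: 

(1) For every $|\varphi\rangle, |\psi\rangle \in M$ and classical string 
$s \in \{0,1\}^* \setminus \{\lambda\}$, it holds 
$\langle\varphi|\psi \circ s\rangle = 0$. 

(2) For every $|\varphi\rangle, |\psi\rangle \in M$ and qubit string 
$|\chi\rangle \perp |\lambda\rangle$, it holds 
$\langle\varphi|\psi \circ \chi\rangle = 0$. 

(3) For every $|\varphi\rangle, |\psi\rangle \in M$ and classical strings 
$s, t \in \{0,1\}^*$ with $s \neq t$, it holds 
$\langle\varphi \circ t|\psi \circ s\rangle = 0$. 

(4) For every $|\varphi\rangle, |\psi\rangle \in M$ and qubit strings 
$|\chi\rangle, |\tau\rangle \in \mathcal{H}_{\{0,1\}^*}$ with 
$|\chi\rangle \perp |\tau\rangle$, it holds 
$\langle\varphi \circ \tau|\psi \circ \chi\rangle = 0$. 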

The relevant case for quantum compression is that $M$ is itself a closed subspace of the string space $\mathcal{H}_{\{0,1\}^*}$. To prove
prefix-freedom of such a subspace, it is sufficient to prove 
this property for an arbitrary orthonormal basis [|| : 

Lemma V.2 A subspace $\mathcal{H} \subset \mathcal{H}_{\{0,1\}^*}$ is prefix-free if and only if it has a prefix-free orthonormal basis. In this case, every orthonormal basis of $\mathcal{H}$ is prefix-free.

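Example V.3 The following subspace $\mathcal{H} \subset \mathcal{H}_{\{0,1\}^*}$ is prefix-free:

$$\mathcal{H} := \mathrm{span}\left\{ \tfrac{1}{\sqrt{2}}\left(|1\rangle + |01\rangle\right),\ \tfrac{1}{\sqrt{2}}\left(|10\rangle - |010\rangle\right) \right\}.$$

It is easily checked that condition (1) from Definition V.1 above is satisfied for the two orthonormal basis vectors.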
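
The check of condition (1) can also be sketched in a few lines of Python. The dictionary representation of qubit strings and the helper names (`append`, `inner`, `is_prefix_free`) are assumptions made for illustration; since both basis vectors have finite base length, only suffixes up to the largest base length need to be tested.

```python
from itertools import product
from math import sqrt

def append(psi, s):
    """|psi o s> for a classical bit string s."""
    return {x + s: a for x, a in psi.items()}

def inner(phi, psi):
    """Inner product <phi|psi> of two qubit strings given as dictionaries."""
    return sum(complex(a).conjugate() * psi.get(x, 0) for x, a in phi.items())

def is_prefix_free(vectors, tol=1e-12):
    """Condition (1): <phi|psi o s> = 0 for all phi, psi and non-empty s."""
    max_len = max(len(x) for v in vectors for x in v)
    suffixes = ["".join(bits) for n in range(1, max_len + 1)
                for bits in product("01", repeat=n)]
    return all(abs(inner(phi, append(psi, s))) < tol
               for phi in vectors for psi in vectors for s in suffixes)

r = 1 / sqrt(2)
e1 = {"1": r, "01": r}
e2 = {"10": r, "010": -r}

print(is_prefix_free([e1, e2]))                   # True
print(is_prefix_free([e1, {"10": r, "010": r}]))  # False: the sign matters
```

The relative minus sign in the second basis vector is what makes the overlap with $|e_1\rangle \circ 0$ cancel; with a plus sign the set is no longer prefix-free.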

As in the classical case, closed prefix-free subspaces obey a Kraft inequality:

Lemma V.4 (Quantum Kraft Inequality) Let $\{|e_i\rangle\}_{i\in I} \subset \mathcal{H}_{\{0,1\}^*}$ be a prefix-free orthonormal system, spanning a closed subspace $\mathcal{H} \subset \mathcal{H}_{\{0,1\}^*}$. Then, it holds

$$\sum_{i\in I} 2^{-L(e_i)} \le \sum_{i\in I} 2^{-\bar{\ell}(e_i)} \le \mathrm{Tr}\left(2^{-\Lambda}\, \mathbb{P}(\mathcal{H})\right) \le 1,$$

where $\mathbb{P}(\mathcal{H})$ denotes the orthogonal projector onto $\mathcal{H}$. Equality holds for the left three terms if and only if every $|e_i\rangle$ is a length eigenvector.
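
As a numerical illustration, the three left-hand terms can be evaluated for the basis of Example V.3. This is a sketch under the assumption that qubit strings are stored as dictionaries from bit strings to amplitudes; `kraft_terms` is a hypothetical helper name.

```python
from math import sqrt

def base_length(psi):
    """Maximal length of a classical string occurring in the superposition."""
    return max(len(x) for x, a in psi.items() if a != 0)

def avg_length(psi):
    """Expectation value of the length operator for a normalised qubit string."""
    return sum(abs(a) ** 2 * len(x) for x, a in psi.items())

def kraft_terms(basis):
    """The three left-hand terms of the quantum Kraft inequality."""
    left = sum(2.0 ** -base_length(e) for e in basis)
    middle = sum(2.0 ** -avg_length(e) for e in basis)
    # For an orthonormal basis of the subspace, Tr(2^{-Lambda} P) = sum_i <e_i|2^{-Lambda}|e_i>
    trace = sum(abs(a) ** 2 * 2.0 ** -len(x) for e in basis for x, a in e.items())
    return left, middle, trace

r = 1 / sqrt(2)
basis = [{"1": r, "01": r}, {"10": r, "010": -r}]
left, middle, trace = kraft_terms(basis)
print(left, middle, trace)   # roughly 0.375, 0.530, 0.5625
```

Since neither basis vector is a length eigenvector, all inequalities are strict in this example.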

Prefix-free subspaces have a remarkable property: every basis vector of length $n$ can be distinguished with certainty from every other (even longer) basis vector by measuring the first $n$ qubits only. Unfortunately, this is only true in general for orthonormal bases of length eigenvectors:

Lemma V.5 An orthonormal system $M \subset \mathcal{H}_{\{0,1\}^*}$ which consists entirely of length eigenvectors is prefix-free if and only if for every $|\varphi\rangle, |\psi\rangle \in M$ with $|\varphi\rangle \neq |\psi\rangle$, it holds

$$\langle\psi|\varphi^{n}|\psi\rangle = 0, \qquad (1)$$

where $\varphi^{n}$ denotes the restriction of the quantum state $|\varphi\rangle\langle\varphi|$ to the first $n := \ell(\psi)$ qubits.

This lemma shows that if the subspace contains an or- 
thonormal basis of length eigenvectors, our definition of 
prefix-freedom is equivalent to the definition by Schu- 
macher and Westmoreland Q- 

We have only collected the basic facts about prefix-free 
quantum bit strings that are relevant for lossless quantum 
compression. For more details, we refer the reader to 
Refs. and @. 

In general, the concatenation operation does not preserve the norm of vectors from $\mathcal{H}_{\{0,1\}^*}$, i.e. it is not an isometry and hence not physically meaningful. However, we shall now prove that concatenation can be implemented in principle on a quantum computer (i.e. it is an isometry) if one restricts to prefix-free Hilbert spaces:

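Theorem V.6 (Isometry of Concatenation)

If $\{|\varphi_1\rangle, |\varphi_2\rangle\} \subset \mathcal{H}_{\{0,1\}^*}$ is a prefix-free set, and $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{H}_{\{0,1\}^*}$, then

$$\langle\varphi_1 \circ \psi_1|\varphi_2 \circ \psi_2\rangle = \langle\varphi_1|\varphi_2\rangle \langle\psi_1|\psi_2\rangle.$$

Consequently, if $\mathcal{H} \subset \mathcal{H}_{\{0,1\}^*}$ is a closed prefix-free subspace, then there exists a unique isometry $U : \mathcal{H} \otimes \mathcal{H}_{\{0,1\}^*} \to \mathcal{H}_{\{0,1\}^*}$ such that $U |\varphi\rangle \otimes |\psi\rangle = |\varphi \circ \psi\rangle$ for every $|\varphi\rangle \in \mathcal{H}$ and $|\psi\rangle \in \mathcal{H}_{\{0,1\}^*}$.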

Note that in the special case that $\mathcal{H}$ is spanned by length
eigenvectors, the map U a corresponds to the "simple 
condensation operation" as defined by Schumacher and 
Westmoreland 0- 
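
The product formula of Theorem V.6 can be probed numerically. The following sketch assumes the dictionary representation of qubit strings and uses the two basis vectors of Example V.3 as the prefix-free set; the random test strings and the helper names are illustrative choices, not part of the paper.

```python
import random
from collections import defaultdict
from itertools import product
from math import sqrt

def concat(phi, psi):
    """|phi o psi> = sum_{s,t} a_s b_t |s o t>."""
    out = defaultdict(complex)
    for s, a in phi.items():
        for t, b in psi.items():
            out[s + t] += a * b
    return dict(out)

def inner(phi, psi):
    return sum(complex(a).conjugate() * psi.get(x, 0) for x, a in phi.items())

def random_string(rng, max_len=3):
    """Unnormalised random qubit string on all bit strings up to max_len (incl. the empty one)."""
    words = ["".join(b) for n in range(max_len + 1) for b in product("01", repeat=n)]
    return {w: complex(rng.gauss(0, 1), rng.gauss(0, 1)) for w in words}

def max_deviation(prefix_free_set, trials=100, seed=0):
    """Largest observed |<phi1 o psi1|phi2 o psi2> - <phi1|phi2><psi1|psi2>|."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        psi1, psi2 = random_string(rng), random_string(rng)
        for phi1 in prefix_free_set:
            for phi2 in prefix_free_set:
                lhs = inner(concat(phi1, psi1), concat(phi2, psi2))
                rhs = inner(phi1, phi2) * inner(psi1, psi2)
                worst = max(worst, abs(lhs - rhs))
    return worst

r = 1 / sqrt(2)
e1 = {"1": r, "01": r}
e2 = {"10": r, "010": -r}
print(max_deviation([e1, e2]))   # zero up to floating-point rounding
```

The strings $|\psi_1\rangle, |\psi_2\rangle$ are left unnormalised on purpose; both sides of the identity are sesquilinear in them, so the comparison is unaffected.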

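Proof. It is easy to check that for every pair of qubit strings $|\varphi_1\rangle, |\varphi_2\rangle \in \mathcal{H}_{\{0,1\}^*}$ and $s \in \{0,1\}^*$, we have $\langle\varphi_1 \circ s|\varphi_2 \circ s\rangle = \langle\varphi_1|\varphi_2\rangle$. Now suppose that additionally $\Phi := \{|\varphi_1\rangle, |\varphi_2\rangle\}$ is a prefix-free set, and $|\psi\rangle \in \mathcal{H}_{\{0,1\}^*}$ is an arbitrary qubit string. Expanding $|\psi\rangle = \sum_{s\in\{0,1\}^*} \gamma_s |s\rangle$, we have

$$\langle\varphi_1 \circ \psi|\varphi_2 \circ \psi\rangle = \sum_{s,t\in\{0,1\}^*} \bar{\gamma}_s \gamma_t \langle\varphi_1 \circ s|\varphi_2 \circ t\rangle \overset{(*)}{=} \sum_{s\in\{0,1\}^*} \bar{\gamma}_s \gamma_s \langle\varphi_1 \circ s|\varphi_2 \circ s\rangle = \langle\varphi_1|\varphi_2\rangle \sum_{s\in\{0,1\}^*} |\gamma_s|^2 = \langle\varphi_1|\varphi_2\rangle \langle\psi|\psi\rangle.$$

In $(*)$, we have used the fact that $\Phi$ is prefix-free, and so $\langle\varphi_1 \circ s|\varphi_2 \circ t\rangle = 0$ if $s \neq t$. Finally, if $|\psi_1\rangle, |\psi_2\rangle \in \mathcal{H}_{\{0,1\}^*}$ are arbitrary qubit strings, then choose an arbitrary orthonormal basis $\{|e_i\rangle\}_{i\in\mathbb{N}}$ of $\mathcal{H}_{\{0,1\}^*}$ such that $|\psi_1\rangle = \lambda |e_1\rangle$ with a scalar $\lambda \in \mathbb{C}$, and expand $|\psi_2\rangle$ as $|\psi_2\rangle = \sum_{i\in\mathbb{N}} \alpha_i |e_i\rangle$. It follows

$$\langle\varphi_1 \circ \psi_1|\varphi_2 \circ \psi_2\rangle = \sum_{i\in\mathbb{N}} \alpha_i \langle\varphi_1 \circ \psi_1|\varphi_2 \circ e_i\rangle \overset{(**)}{=} \alpha_1 \bar{\lambda}\, \langle\varphi_1|\varphi_2\rangle \underbrace{\langle e_1|e_1\rangle}_{=1} = \langle\varphi_1|\varphi_2\rangle \langle\psi_1|\psi_2\rangle.$$

In $(**)$, we have again used the fact that $\Phi$ is prefix-free, and consequently $\langle\varphi_1 \circ \psi_1|\varphi_2 \circ e_i\rangle = 0$ for $i \ge 2$, since $|\psi_1\rangle \perp |e_i\rangle$. □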

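We show now that the base length of a concatenation of two qubit strings is the sum of the individual base lengths. Note that this is in general not true for the average length $\bar{\ell}$: for example, if $|\psi\rangle = \frac{1}{\sqrt{2}}(|1\rangle + |01\rangle)$ and $|\varphi\rangle = \frac{1}{\sqrt{2}}(|10\rangle - |010\rangle)$ are two vectors from the prefix-free Hilbert space $\mathcal{H}$ in Example V.3, and if $|\chi\rangle := \frac{1}{\sqrt{2}}(|\psi\rangle + |\varphi\rangle)$, then it is easy to check that $\frac{19}{4} = \bar{\ell}(\chi \circ \varphi) > \bar{\ell}(\chi) + \bar{\ell}(\varphi) = 2 + \frac{5}{2}$.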

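Lemma V.7 (Additivity of Base Length)

If $|\varphi\rangle, |\psi\rangle \in \mathcal{H}_{\{0,1\}^*}$ are qubit strings with finite base lengths, i.e. $L(\varphi) < \infty$ and $L(\psi) < \infty$, then $L(\varphi \circ \psi) = L(\varphi) + L(\psi)$.

Proof. For every $|\varphi\rangle \in \mathcal{H}_{\{0,1\}^*}$, define $S(\varphi) := \{s \in \{0,1\}^* \mid \langle s|\varphi\rangle \neq 0\}$. It follows that $L(\varphi) = \max\{\ell(s) \mid s \in S(\varphi)\}$. If we expand $|\varphi\rangle =: \sum_{s\in\{0,1\}^*} \alpha_s |s\rangle$ and $|\psi\rangle =: \sum_{t\in\{0,1\}^*} \beta_t |t\rangle$, then

$$|\varphi \circ \psi\rangle = \sum_{s,t} \alpha_s \beta_t |s \circ t\rangle.$$

It follows that

$$S(\varphi \circ \psi) \subset S(\varphi) \circ S(\psi) := \{s \circ t \mid s \in S(\varphi),\ t \in S(\psi)\},$$

and thus

$$L(\varphi \circ \psi) = \max\{\ell(s) \mid s \in S(\varphi \circ \psi)\} \le \max\{\ell(s) \mid s \in S(\varphi) \circ S(\psi)\} = \max\{\ell(s \circ t) \mid s \in S(\varphi),\ t \in S(\psi)\} = \max_{s\in S(\varphi)} \ell(s) + \max_{t\in S(\psi)} \ell(t) = L(\varphi) + L(\psi).$$

Let now $s_{\max}$ and $t_{\max}$ be elements of maximal length in $S(\varphi)$ and $S(\psi)$ respectively. Clearly, $\langle s_{\max} \circ t_{\max}|\varphi \circ \psi\rangle = \sum \alpha_s \beta_t$, where the sum is over all $s \in S(\varphi)$ and $t \in S(\psi)$ such that $s \circ t = s_{\max} \circ t_{\max}$. But because of the maximum length property of $s_{\max}$ and $t_{\max}$, it follows that $\ell(s) = \ell(s_{\max})$ and $\ell(t) = \ell(t_{\max})$, and thus $s = s_{\max}$ and $t = t_{\max}$. Consequently, $\langle s_{\max} \circ t_{\max}|\varphi \circ \psi\rangle = \alpha_{s_{\max}} \beta_{t_{\max}} \neq 0$, and $L(\varphi \circ \psi) \ge L(\varphi) + L(\psi)$. □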
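
The contrast between base length and average length can be reproduced numerically for the vectors of Example V.3. The sketch below assumes the dictionary representation of qubit strings; the helper names and the tolerance used to discard cancelled amplitudes are illustrative choices.

```python
from collections import defaultdict
from math import sqrt

def concat(psi, phi):
    out = defaultdict(complex)
    for s, a in psi.items():
        for t, b in phi.items():
            out[s + t] += a * b
    return dict(out)

def base_length(psi, tol=1e-12):
    """Maximal string length with non-vanishing amplitude."""
    return max(len(x) for x, a in psi.items() if abs(a) > tol)

def avg_length(psi):
    """Expectation value of the length operator (normalised explicitly)."""
    norm = sum(abs(a) ** 2 for a in psi.values())
    return sum(abs(a) ** 2 * len(x) for x, a in psi.items()) / norm

r = 1 / sqrt(2)
psi = {"1": r, "01": r}
phi = {"10": r, "010": -r}
chi = {x: (psi.get(x, 0) + phi.get(x, 0)) * r for x in set(psi) | set(phi)}

both = concat(chi, phi)
print(avg_length(both), avg_length(chi) + avg_length(phi))      # about 4.75 versus 4.5
print(base_length(both), base_length(chi) + base_length(phi))   # 6 and 6
```

The average length is not additive here because the amplitudes on $|1010\rangle$ cancel while those on $|01010\rangle$ add up, which shifts weight towards longer strings; the longest string $|010010\rangle$ survives, so the base length stays additive.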

We explain the meaning of these results for lossless quantum data compression below, after Definition VI.1, the definition of a lossless quantum code.

VI. LOSSLESS QUANTUM DATA 
COMPRESSION 

Our aim is to compute the best possible rate for compressing the unknown output of a single quantum information source, where the source is given by an ensemble $\mathcal{E} = \{p_i, |\psi_i\rangle\}$ of in general non-orthogonal quantum states $|\psi_i\rangle$ with probabilities $p_i > 0$. As motivated in the Introduction, we want to minimize the expected base length of the code, and we want to use a prefix-free code to allow concatenation of code words in the case of several sources.

Definition VI.1 (Lossless Quantum Code)

Let $\mathcal{E} = \{p_i, |\psi_i\rangle\}_{i=1}^n$ be an ensemble of quantum states in a Hilbert space, with $\mathcal{H} := \mathrm{span}\{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$. A lossless code $C$ is an isometric linear map from $\mathcal{H}$ into a closed prefix-free subspace $\mathcal{H}' \subset \mathcal{H}_{\{0,1\}^*}$.

The expected base length of compression of $C$ is

$$E(L(C(\mathcal{E}))) = \sum_i p_i\, L(C(|\psi_i\rangle)). \qquad (2)$$

$C$ is optimal if for any other code $C'$,

$$E(L(C(\mathcal{E}))) \le E(L(C'(\mathcal{E}))).$$

The expression (2) defines the compression rate of the code as the expected base length of the encoding of the output of a single instance of the ensemble. What if we have $n$ copies $\mathcal{E}^{\otimes n}$ of an ensemble $\mathcal{E}$, i.e. several output states are produced independently and identically distributed according to $\mathcal{E}$?

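Suppose we have two different ensembles $\mathcal{E} = \{p_i, |\psi_i\rangle\}$ and $\mathcal{F} = \{q_j, |\varphi_j\rangle\}$ which have optimal codes $C_{\mathcal{E}}$ and $C_{\mathcal{F}}$ respectively. As the codes are prefix-free, we may concatenate them to obtain a code $C_{\mathcal{E}} \circ C_{\mathcal{F}}$ for $\mathcal{E} \otimes \mathcal{F}$. Theorem V.6 proves that this concatenation can be done unitarily, i.e. can be implemented in principle on a quantum computer, and Lemma V.7 tells us that the base lengths then just add up. Explicitly,

$$E(L(C_{\mathcal{E}} \circ C_{\mathcal{F}})) = \sum_{i,j} p_i q_j\, L\big(C_{\mathcal{E}}(|\psi_i\rangle) \circ C_{\mathcal{F}}(|\varphi_j\rangle)\big) = E(L(C_{\mathcal{E}})) + E(L(C_{\mathcal{F}})).$$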






Thus $C_{\mathcal{E}} \circ C_{\mathcal{F}}$ is a code for $\mathcal{E} \otimes \mathcal{F}$ with the simple property that its rate is just the sum of the rates of the two codes. However, it is not necessarily optimal any more. In fact, denoting the optimal compression rate of an ensemble $\mathcal{E}$ by $R(\mathcal{E})$, Theorem VI.4 below will show that if e.g.



£ 



where |0), |1) and \2) are three arbitrary orthonormal vec- 
tors, then R{£) = §, while R(£®£) = f < 2R(8) = f . 
Hence concatenation of codes does not always produce optimal codes (although they are typically quite good), and our result will for example not give a simple expression for the asymptotic rate $\lim_{n\to\infty} \frac{1}{n} R(\mathcal{E}^{\otimes n})$, only the upper bound $R(\mathcal{E})$.

Yet, the result is nevertheless useful, in particular if there is only one output of the source, or if there are several sources $\mathcal{E}_1 \otimes \mathcal{E}_2 \otimes \cdots \otimes \mathcal{E}_k$ which are not known in advance to the compressor. Then, the compression can be done sequentially, for one source after the other, and the code words are concatenated while the rates just add up. As for the compression rate, we get the useful upper bound

$$R\Big(\bigotimes_{i=1}^k \mathcal{E}_i\Big) \le \sum_{i=1}^k R(\mathcal{E}_i)$$

even if there is no translation invariance in the sequence of sources.

This subadditivity property of the optimal rate also shows that in the case of $n$ copies of one source, block coding with concatenation will produce the optimal asymptotic compression rate: write

$$\mathcal{E}^{\otimes n} = \bigotimes_k \mathcal{E}^{\otimes n_k}$$

with $\sum_k n_k = n$ such that the sequence $(n_k)_{k\in\mathbb{N}}$ is increasing. Then, use the optimal code $C_{n_k}$ for each block $\mathcal{E}^{\otimes n_k}$ separately, and concatenate the codes to get a code for $\mathcal{E}^{\otimes n}$. The corresponding compression rate will be asymptotically optimal.

To state the optimal compression rate for single sources, we introduce the notion of monotone entropy and of a sequential projection of some ensemble $\mathcal{E} = \{p_i, |\psi_i\rangle\}$.



Definition VI.2 (Monotone Entropy)

Let $p = (p_1, p_2, \ldots, p_n)$ be a probability vector. Then, we define the monotone entropy $H_{\rm mon}(p)$ as

$$H_{\rm mon}(p) := \min\Big\{ \sum_{i=1}^n p_i \ell_i \;\Big|\; \sum_{i=1}^n 2^{-\ell_i} \le 1,\ \underbrace{\ell_1 \le \ell_2 \le \ldots \le \ell_n}_{(*)},\ \ell_i \in \mathbb{N} \Big\}.$$

Note that the Kraft inequality on the right-hand side implies that the values $\{\ell_i\}_i$ are code word lengths of a prefix code.
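
For small probability vectors, the minimum in this definition can be found by exhaustive search. The following is a minimal sketch, not the authors' program: the function name `monotone_entropy` is hypothetical, lengths are allowed to start at zero, and the search is cut off at length $n$ on the assumption that longer code words never help (a complete prefix code with $n$ words has no word longer than $n-1$).

```python
from fractions import Fraction

def monotone_entropy(p, max_len=None):
    """Minimise sum_i p[i]*l[i] over integer lengths l[0] <= ... <= l[n-1]
    that satisfy the Kraft inequality sum_i 2^(-l[i]) <= 1."""
    n = len(p)
    if max_len is None:
        max_len = n   # assumption: lengths above n are never needed
    best = {"cost": None, "lengths": None}

    def search(i, min_len, kraft, cost, lengths):
        if kraft > 1:
            return                     # Kraft inequality violated
        if best["cost"] is not None and cost >= best["cost"]:
            return                     # cannot improve on the best solution so far
        if i == n:
            best["cost"], best["lengths"] = cost, tuple(lengths)
            return
        for l in range(min_len, max_len + 1):   # enforce monotone lengths
            search(i + 1, l, kraft + Fraction(1, 2 ** l),
                   cost + p[i] * l, lengths + [l])

    search(0, 0, Fraction(0), Fraction(0), [])
    return best["cost"], best["lengths"]

decreasing = [Fraction(5, 8), Fraction(1, 8), Fraction(1, 8), Fraction(1, 8)]
increasing = list(reversed(decreasing))
print(monotone_entropy(decreasing))   # (Fraction(13, 8), (1, 2, 3, 3))
print(monotone_entropy(increasing))   # (Fraction(2, 1), (2, 2, 2, 2))
```

The two calls use the same probabilities in different order and give different values, which illustrates that monotone entropy, unlike Shannon entropy, depends on the ordering of the entries.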

Suppose we removed $(*)$ from the definition. This would mean that we look for the smallest possible rate of any prefix code for the given probability distribution $p$. As is well-known, this best rate is given by the Shannon entropy $H(p)$; thus, we would get back (up to possibly one bit) Shannon entropy. This implies

$$H_{\rm mon}(p) \ge H(p), \qquad (3)$$

and justifies that we call $H_{\rm mon}$ an entropy. Note that $H_{\rm mon}$ changes if we permute the entries of $p$ (while Shannon entropy stays constant). If the elements of $p$ are in decreasing order, then monotone entropy equals Shannon entropy up to possibly one bit:

$$p_1 \ge p_2 \ge \ldots \ge p_n \;\Rightarrow\; H(p) \le H_{\rm mon}(p) \le H(p) + 1. \qquad (4)$$

This is easily proved by inserting $\ell_i := \lceil -\log p_i \rceil$. On the other hand, if we set $\ell_i := \lceil \log n \rceil$ for every $i$, we get the universal upper bound

$$H_{\rm mon}(p) \le \lceil \log n \rceil, \qquad (5)$$

if $n$ denotes the number of elements in $p$.
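
The step behind (4) is the standard Shannon-code argument. Written out as a sketch (assuming all $p_i > 0$ and logarithms to base 2), the choice $\ell_i := \lceil -\log p_i \rceil$ satisfies all three requirements of the definition:

```latex
% Assume p_1 >= p_2 >= ... >= p_n > 0 and set \ell_i := \lceil -\log p_i \rceil.
\begin{align*}
 &\text{monotonicity:} && p_i \ge p_{i+1} \;\Rightarrow\; \ell_i \le \ell_{i+1},\\
 &\text{Kraft inequality:} && \sum_{i=1}^n 2^{-\ell_i} \le \sum_{i=1}^n 2^{\log p_i} = \sum_{i=1}^n p_i = 1,\\
 &\text{rate:} && H_{\rm mon}(p) \le \sum_{i=1}^n p_i \ell_i < \sum_{i=1}^n p_i \left(-\log p_i + 1\right) = H(p) + 1.
\end{align*}
```

Together with the lower bound (3), this gives the two-sided estimate for decreasingly ordered $p$.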

Now we explain the notion of a sequential projection. It is a certain probability distribution which is constructed from $\mathcal{E}$ in a sequential manner.

Definition VI.3 (Sequential Projection) Let $\mathcal{E} = \{(p_i, |\psi_i\rangle)\}_{i=1}^n$ be an ensemble of quantum states. A sequential projection $p' = (p'_1, p'_2, \ldots, p'_k)$ is any probability distribution which can be constructed by the following algorithm:

• Choose an arbitrary integer $i_1 \in \{1, \ldots, n\}$. Then, add up all the probabilities $p_j$ that correspond to vectors $|\psi_j\rangle$ which are linearly dependent on (parallel to) $|\psi_{i_1}\rangle$ to get the value $p'_1$, i.e.

$$I_1 := \{j \in \{1, \ldots, n\} \mid |\psi_j\rangle \in \mathrm{span}\{|\psi_{i_1}\rangle\}\}$$

(in particular, $i_1 \in I_1$), and $p'_1 := \sum_{j\in I_1} p_j$.

• Choose an arbitrary remaining integer $i_2 \in \{1, \ldots, n\} \setminus I_1$. Add up all the probabilities $p_j$ that correspond to vectors $|\psi_j\rangle$ which are linearly dependent on $|\psi_{i_2}\rangle$ and the previously chosen vectors in $I_1$ to get the value $p'_2$, i.e.

$$I_2 := \{j \in \{1, \ldots, n\} \setminus I_1 \mid |\psi_j\rangle \in \mathrm{span}(\{|\psi_{i_2}\rangle\} \cup I_1)\}$$

and $p'_2 := \sum_{j\in I_2} p_j$. (Here and below, an index set inside a span stands for the corresponding vectors.)

• Choose an arbitrary remaining integer $i_3 \in \{1, \ldots, n\} \setminus (I_1 \cup I_2)$. Add up all the probabilities $p_j$ that correspond to vectors $|\psi_j\rangle$ which are linearly dependent on $|\psi_{i_3}\rangle$ and the previously chosen vectors to get the value $p'_3$, i.e.

$$I_3 := \{j \in \{1, \ldots, n\} \setminus (I_1 \cup I_2) \mid |\psi_j\rangle \in \mathrm{span}(\{|\psi_{i_3}\rangle\} \cup I_1 \cup I_2)\}$$

and $p'_3 := \sum_{j\in I_3} p_j$.

• Iterate these steps until there are no remaining vectors in the ensemble.
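
The algorithm above can be sketched in Python as follows. This is an illustration under stated assumptions: vectors are given as tuples of rational coordinates (normalisation is irrelevant for linear dependence), indices are 0-based, linear dependence is tested with an exact rank computation, and the function names `rank` and `sequential_projection` as well as the sample probabilities are hypothetical.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of vectors via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][c] / m[rk][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def sequential_projection(probs, vectors, order):
    """Follow the steps of the definition, choosing indices in the given order."""
    remaining = list(range(len(vectors)))
    chosen = []                      # the vectors |psi_{i_1}>, |psi_{i_2}>, ...
    result = []
    for i in order:
        if i not in remaining:
            continue                 # already absorbed in an earlier step
        chosen.append(vectors[i])
        r = rank(chosen)
        block = [j for j in remaining if rank(chosen + [vectors[j]]) == r]
        result.append(sum(probs[j] for j in block))
        remaining = [j for j in remaining if j not in block]
    return tuple(result)

# |0>, |+> (unnormalised), |1>, |2> with illustrative probabilities
vectors = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 1)]
probs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10), Fraction(4, 10)]
print(sequential_projection(probs, vectors, (0, 1, 3)))     # (1/10, 1/2, 2/5)
print(sequential_projection(probs, vectors, (3, 2, 0, 1)))  # (2/5, 3/10, 3/10)
```

Because every vector of an earlier block lies in the span of the previously chosen vectors, it suffices to keep the chosen vectors when testing membership in the span.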

As an example of a sequential projection, consider the states from an ensemble $\{p_i, |\psi_i\rangle\}_{i=1}^4$:

$$|\psi_1\rangle = |0\rangle, \quad |\psi_2\rangle = |+\rangle := \tfrac{1}{\sqrt{2}}\left(|0\rangle + |1\rangle\right), \quad |\psi_3\rangle = |1\rangle, \quad |\psi_4\rangle = |2\rangle,$$

where $|0\rangle$, $|1\rangle$ and $|2\rangle$ denote orthonormal basis vectors from an arbitrary Hilbert space. Then, applying the definition above, and noting that $|\psi_3\rangle$ is in the span of $|\psi_1\rangle$ and $|\psi_2\rangle$, we get one possible sequential projection as

$$p'_1 = p_1, \quad p'_2 = p_2 + p_3, \quad p'_3 = p_4,$$

where we have chosen $i_1 = 1$, $i_2 = 2$ and $i_3 = 4$. Other choices of indices yield different sequential projections. That is, to every ensemble $\mathcal{E}$, there are several possible sequential projections of $\mathcal{E}$. By combinatorics, the number of sequential projections to an ensemble $\mathcal{E}$ of $n$ elements is upper-bounded by $n!$. Each sequential projection is a probability vector with $\dim \mathcal{H}$ elements, where $\mathcal{H}$ is the span of the ensemble's vectors.

To get an idea how sequential projections are related to base length compression, suppose a code $C$ compresses the state $|0\rangle$ with base length $\ell_1$ and the state $|+\rangle$ with base length $\ell_2 \ge \ell_1$; then, since $|1\rangle$ is in the span of $|0\rangle$ and $|+\rangle$, $|1\rangle$ will typically also be compressed to length $\ell_2$. Suppose $|2\rangle$ is compressed to length $\ell_3$ (which can safely be achieved if $\ell_3 \ge \ell_2$), then the compression rate of $\mathcal{E}$ is

$$E(L(C(\mathcal{E}))) = p_1 \ell_1 + (p_2 + p_3)\ell_2 + p_4 \ell_3 = p'_1 \ell_1 + p'_2 \ell_2 + p'_3 \ell_3.$$

We can now state the optimal rate of compression in terms of monotone entropy and sequential projection, which can both be calculated combinatorially.

Theorem VI.4 (Optimal Compression Rate)

Let $\mathcal{E} = \{p_i, |\psi_i\rangle\}$ be an ensemble of quantum states in some Hilbert space. Then the optimal base length lossless quantum prefix compression code $C$ can be constructed such that it maps into a Hilbert space $\mathcal{H}'$ which is spanned by an orthonormal basis of length eigenvectors. The rate $R$ of this optimal code is given by

$$R = \min\{H_{\rm mon}(p') \mid p' \text{ is a sequential projection of } \mathcal{E}\}.$$

In particular, if the vectors $\{|\psi_i\rangle\}_i$ are linearly independent, then $H(p) \le R \le H(p) + 1$, i.e. the rate is essentially given by the Shannon entropy of $\mathcal{E}$'s probability distribution. In any case, we have the upper bound $R \le H(p) + 1$.
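
For small ensembles the rate formula can be evaluated by brute force. The sketch below is an assumption-laden illustration rather than the authors' method: it enumerates all orderings of the ensemble, uses exact rational arithmetic, represents $|0\rangle, |+\rangle, |1\rangle, |2\rangle$ by unnormalised coordinate tuples, and picks illustrative probabilities; all function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations_with_replacement, permutations

def rank(rows):
    """Rank via exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for c in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][c] / m[rk][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def sequential_projection(probs, vectors, order):
    remaining, chosen, result = list(range(len(vectors))), [], []
    for i in order:
        if i not in remaining:
            continue
        chosen.append(vectors[i])
        r = rank(chosen)
        block = [j for j in remaining if rank(chosen + [vectors[j]]) == r]
        result.append(sum(probs[j] for j in block))
        remaining = [j for j in remaining if j not in block]
    return tuple(result)

def monotone_entropy(q):
    """Brute force over nondecreasing length tuples satisfying Kraft."""
    n = len(q)
    return min(sum(qi * l for qi, l in zip(q, lengths))
               for lengths in combinations_with_replacement(range(n + 1), n)
               if sum(Fraction(1, 2 ** l) for l in lengths) <= 1)

def optimal_rate(probs, vectors):
    """min H_mon(p') over all sequential projections p' of the ensemble."""
    projections = {sequential_projection(probs, vectors, order)
                   for order in permutations(range(len(vectors)))}
    return min(monotone_entropy(q) for q in projections)

vectors = [(1, 0, 0), (1, 1, 0), (0, 1, 0), (0, 0, 1)]   # |0>, |+>, |1>, |2>
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 8), Fraction(1, 8)]
print(optimal_rate(probs, vectors))   # 3/2
```

In this example the Shannon entropy of the probabilities is $7/4$, so the rate $3/2$ lies below it; this is possible only because the four vectors are linearly dependent. The enumeration over all $n!$ orderings is feasible only for small $n$.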

Before we give a proof, we illustrate the theorem with one example. Suppose our ensemble consists of eight states $|\psi_1\rangle, \ldots, |\psi_8\rangle$ from some Hilbert space, each with probability $p_1 = p_2 = \ldots = p_8 = \frac{1}{8}$, such that the span of those eight states has dimension four. Furthermore, suppose that any four of those states are linearly independent.

Our theorem tells us that we can compress the ensemble at least as well as $R \le H(p) + 1 = H\left(\frac{1}{8}, \frac{1}{8}, \ldots, \frac{1}{8}\right) + 1 = 4$, but we can do better than that. To compute the optimal compression rate, we have to look at all possible sequential projections.

We construct a sequential projection $p'$: first, we arbitrarily choose one of the vectors, say, $|\psi_1\rangle$. As there is no other vector which is linearly dependent on (parallel to) $|\psi_1\rangle$, the first entry of $p'$ is $p'_1 := p_1 = \frac{1}{8}$.

As the second step, we choose one of the remaining vectors, say, $|\psi_2\rangle$. We have to check if there are any remaining vectors that are in the span of $|\psi_1\rangle$ and $|\psi_2\rangle$ (i.e. linearly dependent on those two), which is by assumption not the case. Thus, we get $p'_2 := p_2 = \frac{1}{8}$.

We go on by choosing the next vector arbitrarily, say, $|\psi_3\rangle$. As there are no remaining vectors in the span of $|\psi_1\rangle$, $|\psi_2\rangle$ and $|\psi_3\rangle$, we also get $p'_3 := p_3 = \frac{1}{8}$.

Then we select another remaining vector, say, $|\psi_4\rangle$. But now, all the remaining vectors $|\psi_5\rangle$, $|\psi_6\rangle$, $|\psi_7\rangle$ and $|\psi_8\rangle$ are in the linear span of $|\psi_1\rangle$, $|\psi_2\rangle$, $|\psi_3\rangle$ and $|\psi_4\rangle$. Thus, we have to add the corresponding probabilities to get $p'_4 := p_4 + p_5 + p_6 + p_7 + p_8 = \frac{5}{8}$. Thus,

$$p' = \left(\tfrac{1}{8}, \tfrac{1}{8}, \tfrac{1}{8}, \tfrac{5}{8}\right).$$

In this example, repeating the process with different choices of vectors will always result in the same probability distribution $p'$. Thus, in this case, there is only one possible sequential projection of $\mathcal{E}$, which is given above. We get the rate $R$ by computing $R = H_{\rm mon}(p')$. First, we know from (5) that $H_{\rm mon}(p') \le \lceil \log 4 \rceil = 2$. In fact, with the help of a little computer program, it is easy to see that the minimum in the definition of $H_{\rm mon}$ is indeed attained at this value, that is,

$$R = H_{\rm mon}(p') = 2.$$
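
A possible form of such a program is sketched below; it is not the authors' code. It enumerates all nondecreasing length tuples with entries up to $n = 4$ (an assumed cut-off that is sufficient here), keeps those that satisfy the Kraft inequality, and picks the cheapest one for $p' = (\frac{1}{8}, \frac{1}{8}, \frac{1}{8}, \frac{5}{8})$.

```python
from fractions import Fraction
from itertools import combinations_with_replacement

p_prime = [Fraction(1, 8)] * 3 + [Fraction(5, 8)]
n = len(p_prime)

candidates = []
# combinations_with_replacement over a sorted range yields nondecreasing tuples
for lengths in combinations_with_replacement(range(n + 1), n):
    if sum(Fraction(1, 2 ** l) for l in lengths) <= 1:        # Kraft inequality
        cost = sum(q * l for q, l in zip(p_prime, lengths))
        candidates.append((cost, lengths))

best_cost, best_lengths = min(candidates)
print(best_cost, best_lengths)   # 2 (2, 2, 2, 2)
```

The monotonicity constraint forces the most probable entry, which comes last, to receive the longest code word, so the Huffman-like assignment $(3, 3, 2, 1)$ is not admissible and the uniform lengths $(2, 2, 2, 2)$ win.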



Now we prove this theorem. 

Proof. The proof consists of two parts: first, we show that a rate of $R$ is achievable, then we show that this rate is optimal. We shall denote our ensemble by $\mathcal{E} = \{p(|\psi_i\rangle), |\psi_i\rangle\}_{i=1}^n$ with $p(|\psi_i\rangle) =: p_i$, and we write $\mathcal{H} := \mathrm{span}\{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$. For sequential projections, we use the nomenclature from Definition VI.3.

To see the achievability, let $p' = (p'_1, \ldots, p'_d)$ be an arbitrary sequential projection of $\mathcal{E}$. Let $(c_1, \ldots, c_d) \subset \{0,1\}^*$ be a prefix code with code word lengths $\ell_i := \ell(c_i)$ which are minimizers in the definition of $H_{\rm mon}(p')$ (such a code exists due to the Kraft inequality). Let $\mathcal{H}' := \mathrm{span}\{|c_1\rangle, \ldots, |c_d\rangle\} \subset \mathcal{H}_{\{0,1\}^*}$. We will now construct a code (a linear isometric map) $C : \mathcal{H} \to \mathcal{H}'$. For $i \in \{1, \ldots, d\}$, let $\Psi_i$ be the set of vectors from $\mathcal{E}$ that have been chosen in step $i$ of the construction of $p'$, such that $\Psi_i \cap \Psi_j = \emptyset$ for $i \neq j$, $\bigcup_{i=1}^d \Psi_i = \{|\psi_1\rangle, \ldots, |\psi_n\rangle\}$, and $p'_i = \sum_{|\psi\rangle \in \Psi_i} p(|\psi\rangle)$.

We start by specifying the action of $C$ on the vectors of $\Psi_1$. All the vectors in $\Psi_1$ are equal up to some phase factor, i.e. they are equal to $e^{i\theta}|\psi\rangle$, where $\theta \in \mathbb{R}$. We set

$$C|\psi\rangle := |c_1\rangle; \qquad (6)$$

then $L(C(|\psi\rangle)) = \ell(c_1)$ for every $|\psi\rangle \in \Psi_1$. Since $\dim \mathrm{span}(\Psi_1 \cup \Psi_2) = 2$, we can construct $C$ such that

$$C\left(\mathrm{span}(\Psi_1 \cup \Psi_2)\right) = \mathrm{span}\{|c_1\rangle, |c_2\rangle\} \qquad (7)$$

isometrically, while still respecting (6). Consequently, $L(C(|\psi\rangle)) = \max\{\ell(c_1), \ell(c_2)\} = \ell(c_2)$ for every $|\psi\rangle \in \Psi_2$ (here we use the monotonicity property $\ell(c_1) \le \ell(c_2) \le \ldots$). The next step is to demand

$$C\left(\mathrm{span}(\Psi_1 \cup \Psi_2 \cup \Psi_3)\right) = \mathrm{span}\{|c_1\rangle, |c_2\rangle, |c_3\rangle\}$$

isometrically, while respecting (6) and (7). Iterating this process, we obtain a code $C$ in the sense of Definition VI.1. The expected base length compression rate is

$$\sum_{i=1}^n p_i\, L(C(|\psi_i\rangle)) = \sum_{i=1}^d \sum_{|\psi\rangle \in \Psi_i} p(|\psi\rangle)\, \ell_i = \sum_{i=1}^d p'_i \ell_i = H_{\rm mon}(p').$$

Next, we show that this code is optimal, i.e. no lossless code can beat monotone entropy. Thus, let $C : \mathcal{H} \to \mathcal{H}'$ be a code in the sense of Definition VI.1, and let $r(C) := \sum_i p_i L(C(|\psi_i\rangle))$ be the corresponding compression rate. We may assume that the vectors $|\psi_i\rangle$ are ordered such that $i < j \Rightarrow L(C(|\psi_i\rangle)) \le L(C(|\psi_j\rangle))$.

We will now construct a sequential projection $p'$ which corresponds to this code $C$. Let $i_1 := 1$, and $\ell_1 := L(C(|\psi_{i_1}\rangle))$. Suppose $j \in I_1$, then $|\psi_j\rangle$ is linearly dependent on $|\psi_{i_1}\rangle$. Since $C$ is isometric, $C(|\psi_j\rangle)$ must be linearly dependent on $C(|\psi_{i_1}\rangle)$ as well, and so $L(C(|\psi_j\rangle)) = \ell_1$. So

$$\sum_{j\in I_1} p_j\, L(C(|\psi_j\rangle)) = \sum_{j\in I_1} p_j \ell_1 = p'_1 \ell_1.$$

Then, let $i_2$ be the smallest natural number which is not in $I_1$, and let $\ell_2 := L(C(|\psi_{i_2}\rangle))$. If $j \in I_2$, then $|\psi_j\rangle$ is in the linear span of $I_1$ and $|\psi_{i_2}\rangle$. Since $C$ is isometric, we can again conclude that $L(C(|\psi_j\rangle)) \le \ell_2$. But if we had $L(C(|\psi_j\rangle)) < \ell_2 = L(C(|\psi_{i_2}\rangle))$, then it would follow that $j < i_2$, which is impossible. Hence $L(C(|\psi_j\rangle)) = \ell_2$, and

$$\sum_{j\in I_2} p_j\, L(C(|\psi_j\rangle)) = \sum_{j\in I_2} p_j \ell_2 = p'_2 \ell_2.$$

We iterate this procedure until all the vectors from the ensemble have been used. Since the vectors $|\psi_i\rangle$ are ordered according to their lengths, we have $\ell_1 \le \ell_2 \le \ell_3 \le \ldots$ and so on. Moreover, these code word lengths satisfy the Kraft inequality. To see this, note that the vectors $C|\psi_{i_k}\rangle$ are linearly independent and span the Hilbert space $\mathcal{H}'$ (we may assume $\mathcal{H}' = C(\mathcal{H})$, since a subset of a prefix-free set is prefix-free). Let $\{|\varphi_k\rangle\}_k$ be the orthonormal basis of $\mathcal{H}'$ which is generated by the Gram-Schmidt orthonormalization process from the basis $\{C|\psi_{i_k}\rangle\}_k$. It follows that

$$L(|\varphi_k\rangle) \le \max_{k' \le k} L(C(|\psi_{i_{k'}}\rangle)) = L(C(|\psi_{i_k}\rangle)) = \ell_k.$$

Since $\mathcal{H}'$ is a prefix Hilbert space, the quantum Kraft inequality from Lemma V.4 yields

$$\sum_k 2^{-\ell_k} \le \sum_k 2^{-L(|\varphi_k\rangle)} \le 1. \qquad (8)$$

Moreover, $p' = (p'_1, p'_2, \ldots)$ is by construction a sequential projection. Hence

$$r(C) = \sum_{j=1}^n p_j\, L(C(|\psi_j\rangle)) = \sum_k p'_k \ell_k \ge H_{\rm mon}(p'),$$

which concludes the optimality part of the proof. An easy additional argument shows that the optimal code Hilbert space $\mathcal{H}'$ may always be chosen to be spanned by an orthonormal basis of length eigenstates: Due to (8), there is a classical prefix-free code $\{c_k\}_k$ with $\ell(c_k) = L(|\varphi_k\rangle)$. Let $\mathcal{H}'' := \mathrm{span}_k |c_k\rangle$, then $\mathcal{H}''$ is prefix-free. Let $U|\varphi_k\rangle := |c_k\rangle$, then $U$ maps $\mathcal{H}'$ unitarily onto $\mathcal{H}''$. Hence, the composition $U \circ C$ is a lossless quantum code. Suppose $j \in I_k$, then $|\psi_j\rangle \in \mathrm{span}_{k' \le k} |\psi_{i_{k'}}\rangle$, hence

$$U \circ C(|\psi_j\rangle) \in \mathrm{span}_{k' \le k}\, U \circ C(|\psi_{i_{k'}}\rangle) = \mathrm{span}_{k' \le k}\, U|\varphi_{k'}\rangle,$$

and so

$$L(U \circ C(|\psi_j\rangle)) \le \max_{k' \le k} L(U|\varphi_{k'}\rangle) = \max_{k' \le k} \ell(c_{k'}) = \max_{k' \le k} L(|\varphi_{k'}\rangle) \le \max_{k' \le k} \ell_{k'} = \ell_k = L(C(|\psi_j\rangle)).$$

Thus, $r(U \circ C) \le r(C)$, i.e. $U \circ C$ compresses at least as well as $C$.

In the special case that all the vectors are linearly independent, the sequential projections of $\mathcal{E}$ are exactly the permutations of the probability distribution $p$. Using (3) and (4), we thus get

$$R = \min_{p'} H_{\rm mon}(p') = \min_{\sigma \text{ permutation}} H_{\rm mon}(\sigma(p)) \in [H(p), H(p) + 1].$$

It remains to prove that the optimal rate is always bounded above by $H(p) + 1$. For this purpose, rearrange the vectors in decreasing order such that $p_1 \ge p_2 \ge \ldots \ge p_n$. Let $p'$ be the sequential projection which is constructed by going through the list of $|\psi_i\rangle$'s in that order. As before, denote by $\Psi_j$ the set of $|\psi_i\rangle$'s that have been collected in step $j$ of the construction of $p'$. Let

$$\ell_i := \Big\lceil -\log \max_{|\psi\rangle \in \Psi_i} p(|\psi\rangle) \Big\rceil.$$

By construction, $\ell_1 \le \ell_2 \le \ldots \le \ell_d$, and the Kraft inequality holds for the $\ell_i$. Thus,

$$R \le H_{\rm mon}(p') \le \sum_{i=1}^d p'_i \ell_i = \sum_{i=1}^d \Big( \sum_{|\psi\rangle \in \Psi_i} p(|\psi\rangle) \Big) \Big\lceil -\log \max_{|\psi'\rangle \in \Psi_i} p(|\psi'\rangle) \Big\rceil \le \sum_{i=1}^d \sum_{|\psi\rangle \in \Psi_i} p(|\psi\rangle) \lceil -\log p(|\psi\rangle) \rceil \le H(p) + 1.$$

This proves the statement of the theorem. □



VII. CONCLUSIONS 

We have given a method for lossless compression of unknown outputs of single quantum information sources which minimizes the code's expected base length, and we have calculated the corresponding optimal compression rate (Theorem VI.4). Moreover, we have explained a simple model of an always-open channel which admits the lossless transmission of the indeterminate-length code words, and we have explained that compression can reduce transmission errors for those channels.

As our approach quantifies the rate in terms of the base 
length, it complements work by Schumacher and West- 
moreland Q who have given the optimal rate for aver- 
age length compression. Furthermore, we have demon- 
strated how to apply the theory of prefix-free subspaces 
to quantum information. In short, prefix-free quantum strings allow sequential compression in the case of several quantum information sources by concatenating the corresponding code words. The concatenation can be accomplished physically (Theorem V.6), even in the case of prefix-free subspaces which are more general than in Schumacher and Westmoreland's sense (cf. Example V.3).

At this point, it remains open whether there is a simple formula for the optimal asymptotic compression rate $\lim_{n\to\infty} \frac{1}{n} R(\mathcal{E}^{\otimes n})$ in the case of $n$ copies of a single source $\mathcal{E}$, apart from the upper bound $R(\mathcal{E})$. Also, it would be nice to have an example of a physical situation where base length compression is better suited to reduce transmission errors for channels than average length compression (cf. Section IV). Even though the optimal asymptotic compression rate is not given in this paper, the result is optimal for the case of a sequence of several sources $\mathcal{E}_1 \otimes \mathcal{E}_2 \otimes \cdots \otimes \mathcal{E}_k$ which are not known in advance and have to be compressed sequentially.

Many open questions in quantum information theory, such as entanglement catalysis [15], are phrased as "How can this state be transformed into that state exactly and without error subject to these conditions?". Perhaps lossless quantum base length compression can be applied to some of these questions. Bostrom and Felbinger Q
stated that lossless quantum compression may also have 
applications in cryptography. Perhaps it can be used to 
minimise the probability that an eavesdropper discovers 
any information at all, rather than the average informa- 
tion that the eavesdropper discovers [Til ]. 

Another possible connection to existing work is in the definition of quantum Kolmogorov complexity by Berthiaume et al. [16]. They define the complexity of a quantum bit string as the length of its shortest determinate-length description. Therefore we might expect there to be a close correlation between this kind of complexity and the rate of compression described in this paper, in the same way that there is a close correlation between classical Kolmogorov complexity and Shannon entropy.

Apart from possible applications, one purpose of this 
paper was to show that prefix-free quantum bit strings 
are a mathematical structure with nice properties that 
can be useful in quantum information theory. It might 
be interesting to study them in more detail, in particular 
in connection to possible quantum versions of algorithmic 
probability. 